part-whole hierarchy
- Asia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning
A critical aspect of human visual perception is the ability to parse visual scenes into individual objects and further into object parts, forming part-whole hierarchies. Such composite structures could induce a rich set of semantic concepts and relations, thus playing an important role in the interpretation and organization of visual signals as well as in the generalization of visual perception and reasoning. However, existing visual reasoning benchmarks mostly focus on objects rather than parts. Visual reasoning based on the full part-whole hierarchy is much more challenging than object-centric reasoning due to finer-grained concepts, richer geometric relations, and more complex physics. Therefore, to better support part-based conceptual, relational and physical reasoning, we introduce a new large-scale diagnostic visual reasoning dataset named PTR.
- Asia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
SynDaCaTE: A Synthetic Dataset For Evaluating Part-Whole Hierarchical Inference
Levi, Jake, van der Wilk, Mark
Learning to infer object representations, and in particular part-whole hierarchies, has been the focus of extensive research in computer vision, in pursuit of improving data efficiency, systematic generalisation, and robustness. Models which are designed to infer part-whole hierarchies, often referred to as capsule networks, are typically trained end-to-end on supervised tasks such as object classification, in which case it is difficult to evaluate whether such a model actually learns to infer part-whole hierarchies, as claimed. To address this difficulty, we present a SYNthetic DAtaset for CApsule Testing and Evaluation, abbreviated as SynDaCaTE, and establish its utility by (1) demonstrating the precise bottleneck in a prominent existing capsule model, and (2) demonstrating that permutation-equivariant self-attention is highly effective for parts-to-wholes inference, which motivates future directions for designing effective inductive biases for computer vision.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > Canada (0.04)
- Europe > Finland (0.04)
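The SynDaCaTE abstract above refers to permutation-equivariant self-attention. As a minimal sketch of what that property means (not the authors' model), the snippet below implements single-head dot-product self-attention in plain Python and checks that reordering the input part vectors reorders the outputs in the same way. The weight matrices `wq`, `wk`, `wv`, the toy `parts` vectors, and all helper names are illustrative assumptions, not values or identifiers from the paper.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention, no positional encoding."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)  # scores[i][j] = q_i . k_j
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)

def permute(rows, order):
    """Reorder rows so that the new row i is the old row order[i]."""
    return [rows[i] for i in order]

def all_close(a, b, tol=1e-9):
    """Element-wise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical toy inputs: three "part" vectors and fixed projection weights.
parts = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
wq = [[0.2, -0.1], [0.4, 0.3]]
wk = [[-0.3, 0.5], [0.1, 0.2]]
wv = [[1.0, 0.5], [-0.5, 1.0]]
order = [2, 0, 1]

# Equivariance: attend-then-permute should match permute-then-attend.
attend_then_permute = permute(self_attention(parts, wq, wk, wv), order)
permute_then_attend = self_attention(permute(parts, order), wq, wk, wv)
is_equivariant = all_close(attend_then_permute, permute_then_attend)
print("permutation-equivariant:", is_equivariant)
```

The check holds because no positional information enters the computation: each output row depends only on its own input vector and on the unordered set of all input vectors.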
Neural Models for Part-Whole Hierarchies
We present a connectionist method for representing images that explicitly addresses their hierarchical nature. It blends data from neuroscience about whole-object viewpoint-sensitive cells in inferotemporal cortex [8] and attentional basis-field modulation in V4 [3] with ideas about hierarchical descriptions based on microfeatures [5, 11]. The resulting model makes critical use of bottom-up and top-down pathways for analysis and synthesis [6]. We illustrate the model with a simple example of representing information about faces. Images of objects constitute an important paradigm case of a representational hierarchy, in which 'wholes', such as faces, consist of 'parts', such as eyes, noses and mouths.
Are We Witnessing the Next Evolution of Artificial Intelligence?
Geoffrey Hinton, a British computer scientist who has spent his entire career pushing the field of artificial intelligence (AI) forward, is a pioneer of the field and a winner of the 2018 Turing Award. After graduating from the University of Cambridge in 1970 with a BA in experimental psychology, Mr. Hinton joined the graduate program in artificial intelligence at the University of Edinburgh, with neural networks as his focus. He currently splits his time between Google Brain (the division dedicated to artificial intelligence research) and the University of Toronto, where he is working to give artificial intelligence based on deep learning a form of intuition. For over 30 years, Geoffrey Hinton hovered at the edges of artificial intelligence research, an outsider clinging to a simple proposition: that computers could think as humans do, using intuition rather than rules. I recently wrote an article about intuition, the ability to recognize similarities quickly.
- North America > Canada > Ontario > Toronto (0.57)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.25)
Active Predictive Coding Networks: A Neural Solution to the Problem of Learning Reference Frames and Part-Whole Hierarchies
Gklezakos, Dimitrios C., Rao, Rajesh P. N.
We introduce Active Predictive Coding Networks (APCNs), a new class of neural networks that solve a major problem posed by Hinton and others in the fields of artificial intelligence and brain modeling: how can neural networks learn intrinsic reference frames for objects and parse visual scenes into part-whole hierarchies by dynamically allocating nodes in a parse tree? APCNs address this problem by using a novel combination of ideas: (1) hypernetworks are used for dynamically generating recurrent neural networks that predict parts and their locations within intrinsic reference frames conditioned on higher object-level embedding vectors, and (2) reinforcement learning is used in conjunction with backpropagation for end-to-end learning of model parameters. The APCN architecture lends itself naturally to multi-level hierarchical learning and is closely related to predictive coding models of cortical function. Using the MNIST, Fashion-MNIST and Omniglot datasets, we demonstrate that APCNs can (a) learn to parse images into part-whole hierarchies, (b) learn compositional representations, and (c) transfer their knowledge to unseen classes of objects. With their ability to dynamically generate parse trees with part locations for objects, APCNs offer a new framework for explainable AI that leverages advances in deep learning while retaining interpretability and compositionality.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > France (0.04)
- Law > Litigation (0.83)
- Health & Medicine > Therapeutic Area > Neurology (0.68)